
Supplementary Materials for Balanced Meta-Softmax for Long-Tailed Visual Recognition

Neural Information Processing Systems

A careful implementation is required for instance segmentation tasks. First, we define f as

f(x) := l(θ) + t    (24)

where l(θ) and t are as defined in the main paper.



Q1: Explanation of the mismatch (1/4 vs. 1) between the theory (Theorem 2 and Corollary 2.1) and practice

Neural Information Processing Systems

We will answer the major points below and address all remaining ones in the final version. We leave further discussion of the convergence rate to future work. Eqn. 11 in [B] is generic (a superset of most loss-engineering approaches such as [3, 29, A]); it uses bi-level optimization. We will add a discussion of [3, A, B] in the final version. Q2: The Meta Sampler has a similar idea to [12, 24, 27]. ... CIFAR10-LT); theirs are instance-based while ours is class-based (fewer parameters and a simpler optimization landscape).
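The class-based versus instance-based distinction in Q2 can be made concrete with a small sketch: one learnable scalar per class is broadcast to every instance of that class and normalized over the dataset, whereas an instance-based scheme would keep one parameter per sample. The function name and shapes here are illustrative, not from the paper.

```python
import math

def class_based_sample_probs(class_logits, labels):
    """Class-based sampling distribution (illustrative sketch of Q2's contrast).

    One learnable scalar per CLASS (`class_logits`) is exponentiated,
    broadcast to every instance of that class via its label, and then
    renormalized over the whole dataset. With C classes this has C
    parameters, instead of one per sample as in instance-based schemes.
    """
    weights = [math.exp(class_logits[y]) for y in labels]
    total = sum(weights)
    return [w / total for w in weights]
```

With all class logits equal, this reduces to uniform sampling; raising one class's logit raises the sampling rate of every instance of that class at once.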


Multi-scale Spatio-temporal Transformer-based Imbalanced Longitudinal Learning for Glaucoma Forecasting from Irregular Time Series Images

Yang, Xikai, Wu, Jian, Wang, Xi, Yuan, Yuchen, Wang, Ning Li, Heng, Pheng-Ann

arXiv.org Artificial Intelligence

Glaucoma is one of the major eye diseases that lead to progressive optic nerve fiber damage and irreversible blindness, afflicting millions of individuals. Glaucoma forecasting is a good solution for early screening and intervention of potential patients, which helps prevent further deterioration of the disease. It leverages a series of historical fundus images of an eye and forecasts the likelihood of glaucoma occurrence in the future. However, the irregular sampling nature and the imbalanced class distribution are two challenges in developing disease forecasting approaches. To this end, we introduce the Multi-scale Spatio-temporal Transformer Network (MST-former) based on the transformer architecture tailored for sequential image inputs, which can effectively learn representative semantic information from sequential images in both the temporal and spatial dimensions. Specifically, we employ a multi-scale structure to extract features at various resolutions, which largely exploits the rich spatial information encoded in each image. Besides, we design a time distance matrix to scale time attention in a non-linear manner, which effectively handles the irregularly sampled data. Furthermore, we introduce a temperature-controlled Balanced Softmax Cross-entropy loss to address the class imbalance issue. Extensive experiments on the Sequential fundus Images for Glaucoma Forecast (SIGF) dataset demonstrate the superiority of the proposed MST-former method, achieving an AUC of 98.6% for glaucoma forecasting. Besides, our method shows excellent generalization capability on the Alzheimer's Disease Neuroimaging Initiative (ADNI) MRI dataset, with an accuracy of 90.3% for mild cognitive impairment and Alzheimer's disease prediction, outperforming the compared method by a large margin.
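The temperature-controlled Balanced Softmax Cross-entropy mentioned in this abstract can be sketched in a few lines. This is a minimal pure-Python illustration, not the authors' implementation; in particular, placing the temperature tau as a multiplier on the log-count prior is an assumption, chosen so that tau = 1 recovers plain Balanced Softmax and tau = 0 recovers standard softmax cross-entropy.

```python
import math

def balanced_softmax_ce(logits, label, class_counts, tau=1.0):
    """Temperature-controlled Balanced Softmax cross-entropy for one sample.

    Each logit is shifted by tau * log(n_j), where n_j is the training-set
    count of class j, so head classes must score higher to attain the same
    loss. The role of tau here is an assumption, not taken from the paper:
    tau = 1 gives plain Balanced Softmax, tau = 0 the standard softmax CE.
    """
    adjusted = [z + tau * math.log(n) for z, n in zip(logits, class_counts)]
    m = max(adjusted)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(a - m) for a in adjusted))
    return log_norm - adjusted[label]  # -log softmax(adjusted)[label]
```

With equal class counts the loss reduces to standard cross-entropy; with a long-tailed count vector, a tail-class sample with the same raw scores incurs a larger loss, pushing the model to compensate for the imbalance.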


Balanced softmax cross-entropy for incremental learning with and without memory

Jodelet, Quentin, Liu, Xin, Murata, Tsuyoshi

arXiv.org Artificial Intelligence

When incrementally trained on new classes, deep neural networks are subject to catastrophic forgetting, which leads to an extreme deterioration of their performance on the old classes while learning the new ones. Using a small memory containing a few samples from past classes has been shown to be an effective method to mitigate catastrophic forgetting. However, due to the limited size of the replay memory, there is a large imbalance between the number of samples for the new and the old classes in the training dataset, resulting in a bias in the final model. To address this issue, we propose to use the Balanced Softmax Cross-Entropy and show that it can be seamlessly combined with state-of-the-art approaches for class-incremental learning in order to improve their accuracy while also potentially decreasing the computational cost of the training procedure. We further extend this approach to the more demanding class-incremental learning without memory setting and achieve competitive results with memory-based approaches. Experiments on the challenging ImageNet, ImageNet-Subset and CIFAR100 benchmarks with various settings demonstrate the benefits of our approach.
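The imbalance this abstract describes comes directly from the per-class sample counts seen at each incremental step: old classes contribute only their few replay exemplars, while new classes arrive with full training data. A hypothetical helper (the names are ours, not the paper's) makes explicit the count vector that a Balanced Softmax loss would consume at such a step:

```python
def incremental_class_counts(old_classes, exemplars_per_class, new_class_counts):
    """Per-class training counts at one class-incremental step (sketch only).

    old_classes:       classes learned in earlier tasks, now represented
                       only by `exemplars_per_class` replay samples each.
    new_class_counts:  {class_id: full training-set count} for the new task.
    """
    counts = {c: exemplars_per_class for c in old_classes}
    counts.update(new_class_counts)  # new classes dominate the distribution
    return counts
```

Feeding these counts to a Balanced Softmax cross-entropy is what lets the loss correct for the old-versus-new imbalance without changing the replay memory itself.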


Balanced Meta-Softmax for Long-Tailed Visual Recognition

Ren, Jiawei, Yu, Cunjun, Sheng, Shunan, Ma, Xiao, Zhao, Haiyu, Yi, Shuai, Li, Hongsheng

arXiv.org Machine Learning

Deep classifiers have achieved great success in visual recognition. However, real-world data is long-tailed by nature, leading to a mismatch between the training and testing distributions. In this paper, we show that the Softmax function, though used in most classification tasks, gives a biased gradient estimation under the long-tailed setup. This paper presents Balanced Softmax, an elegant unbiased extension of Softmax, to accommodate the label distribution shift between training and testing. Theoretically, we derive the generalization bound for multiclass Softmax regression and show our loss minimizes the bound. In addition, we introduce Balanced Meta-Softmax, applying a complementary Meta Sampler to estimate the optimal class sample rate and further improve long-tailed learning. In our experiments, we demonstrate that Balanced Meta-Softmax outperforms state-of-the-art long-tailed classification solutions on both visual recognition and instance segmentation tasks.
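A concrete reading of "accommodating the label distribution shift": during training, the softmax is computed on logits shifted by the log class counts, so that the model's plain softmax output estimates the balanced (test-time) conditional rather than the long-tailed training one. A minimal sketch, not the authors' code:

```python
import math

def softmax(zs):
    """Standard numerically-stable softmax, used as-is at test time."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def balanced_softmax(zs, class_counts):
    """Training-time Balanced Softmax: add log(n_j) to logit j before the
    softmax, folding the long-tailed training label prior into the loss so
    the bare logits are left estimating the balanced conditional."""
    return softmax([z + math.log(n) for z, n in zip(zs, class_counts)])
```

Training minimizes -log balanced_softmax(z, counts)[y]; at inference, one simply takes the argmax of softmax(z), under the assumption of a balanced test distribution.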


Balanced Activation for Long-tailed Visual Recognition

Ren, Jiawei, Yu, Cunjun, Cai, Zhongang, Zhao, Haiyu

arXiv.org Machine Learning

Deep classifiers have achieved great success in visual recognition. However, real-world data is long-tailed by nature, leading to a mismatch between the training and testing distributions. In this report, we introduce Balanced Activation (Balanced Softmax and Balanced Sigmoid), an elegant, unbiased, and simple extension of the Sigmoid and Softmax activation functions, to accommodate the label distribution shift between training and testing in object detection. We derive the generalization bound for multiclass Softmax regression and show our loss minimizes the bound. In our experiments, we demonstrate that Balanced Activation generally provides a ~3% gain in terms of mAP on LVIS-1.0 and outperforms the current state-of-the-art methods without introducing any extra parameters.